Abstract

The purpose of this research was to analyze recent statistics textbooks in the behavioral sciences 
in terms of their coverage of exploratory data analysis (EDA) philosophy and techniques. 

Twenty popular texts were analyzed. EDA philosophy was not addressed in the vast majority of 
texts. Only three texts had an entire chapter on EDA. None of the authors used the term 
“confirmatory data analysis” or discussed model building or cross-validation. Seven texts 
contained references to published work by Tukey, but these references were mainly for specific 
techniques, most typically the stem-and-leaf display and box-and-whiskers plot, which were 
presented in 15 and 9 texts, respectively. The paper ends with recommendations for integrating 
EDA into the fields of psychology and education. 
Whatever Happened to Exploratory Data Analysis? An Analysis of Behavioral Science Statistics Textbooks

The vast majority of research conducted in the behavioral sciences is based on a 
confirmatory data analysis (CDA) model. In CDA, the researcher states a null and alternative 
hypothesis, designs a study, collects data, analyzes the data using one or more statistical 
significance tests, and then makes a decision about whether or not to reject the null hypothesis. 

To help interpret the research results, the researcher usually considers summary statistics such as 
means, standard deviations, and effect size indices. He or she might also consider confidence 
interval estimates, although this occurs less frequently, and might also generate summary graphs 
such as bar charts and histograms. 

Since the early 1960s, John Tukey and his colleagues have written about the limitations 
associated with carrying out CDA without first conducting exploratory data analysis (EDA). 
Tukey and his colleagues have argued that by relying solely on CDA, too much trust is placed on 
statistical summaries of the data, which may, in fact, hide or misrepresent important aspects of the 
data. The Anglo-Saxon jurisprudence model has often been used as a metaphor for describing 
the differences between CDA and EDA. EDA has been likened to the investigative aspects of a 
legal case (i.e., the “detective work”), whereas CDA has been likened to a jury trial, where the 
goal is to make a decision about guilt or innocence (see Hoaglin, Mosteller, and Tukey, 1983; 
Mosteller and Tukey, 1977; Tukey, 1977). 

In 1977, Tukey published his classic textbook entitled Exploratory Data Analysis. In this 
textbook and in other writings, Tukey described EDA as an attitude towards data analysis in which 
the researcher is skeptical of statistical summaries and open to finding patterns in the data that 
might not have been anticipated. Tukey described EDA as an iterative model-building process 
that involves examining data from multiple perspectives; developing tentative hypotheses; 
re-analyzing the data with these tentative hypotheses in mind; revising hypotheses; and so forth. 
Ultimately, the exploratory data analyst is looking for a model that will explain the data. This 
tentative model can then be cross-validated with a second sample using CDA techniques. Tukey 
and his colleagues have argued that CDA should always be preceded by EDA. As Tukey stated, 
“exploratory data analysis can never be the whole story, but nothing else can serve as the 
foundation stone--as the first step” (Tukey, 1977, p. 3). 

Tukey developed a number of techniques for carrying out EDA. He was a strong 
advocate of analyzing graphical displays, especially those that include information on individual 
data points, and he developed the stem-and-leaf display and box-and-whiskers plot to this end. 
Tukey was also a strong advocate of examining “resistant indicators,” a term he coined, which 
are statistical summaries that are less affected by extremes in the data (such as the median and 
inter-quartile range). 

It has been over 25 years since Exploratory Data Analysis was published (Tukey, 1977). 
Since that time, a number of leading methodologists and statisticians in the behavioral sciences 
have been strong advocates of integrating EDA into the behavioral science research process (see, 
for example, Behrens, 1997; Cohen, 1990, 1994). For example, in his article on the use of EDA 
in the field of psychology, Behrens argued: 

EDA should be recognized as an important aspect of data analysis whose conduct and 
publication are valued. By admitting EDA as an acceptable set of procedures, 
researchers can avoid the improper use of CDA techniques for the purpose of data 
exploration. As long as EDA remains a covert activity, researchers will continue to 
improperly use CDA for data exploration through model underspecification and 
overtesting. An increase in EDA will focus more resources at the preliminary stages of 
investigations and less at the advanced stages. In doing so, the number of irreproducible 
results may be reduced by the substitution of adequate model building for the cataloging 
of significant effects. Further, the detail in modeling afforded by EDA may improve our 
understanding of phenomenon otherwise hidden behind simple summary statistics and 
tests ... (p. 154) 

What impact have Tukey and his colleagues had on the way in which data analysis is 
taught in the behavioral sciences? Exploratory data analysts would argue that EDA should 
always be a part of statistical training, not as a topic which is taught separately in advanced 
graduate-level courses, but as an integral component of any basic statistics course, including 
undergraduate courses. But to what extent is this actually occurring? For better or worse, the 
content of behavioral science statistics textbooks will determine the curriculum of most statistics 
classes. The purpose of this study was therefore to conduct a content analysis of current 
statistics textbooks in the behavioral sciences to evaluate the extent to which EDA philosophy, 
general heuristics, and specific techniques are covered in these textbooks. 

Background 

Exploratory Data Analysis 

A detailed description of EDA is beyond the scope of this paper. The interested reader is 
referred to Tukey (1977), Mosteller and Tukey (1977), and Hoaglin, Mosteller, and Tukey (1983, 
1985) for more thorough coverage. Excellent summaries of EDA can be found in Behrens 
(1997), Hartwig and Dearing (1979), and Leinhardt and Leinhardt (1980). 

As mentioned, Tukey and his colleagues often compared the data analysis process to a 
jurisprudence model. As Behrens (1997) stated: 
in EDA, the goal is not to draw conclusions regarding guilt or innocence but rather to 
investigate the actors, generate hunches, and provide preliminary evidence. EDA is more 
like an interrogation in which clean and corrupted stories are told, whereas CDA is 
testimony regarding evidence that fits carefully laid-out trial procedures. The goal of 
EDA is indictment; the goal of CDA is conviction. (p. 133) 

EDA typically begins by examining each variable individually, combing through the 
data, checking the shapes of distributions, looking for outliers and rogue values. Then the 
exploratory data analyst turns to looking at relationships between pairs of variables, and finally 
considers multivariate relationships. Tukey (1977) developed a number of techniques to aid in 
EDA, but he repeatedly emphasized that the use of these techniques alone does not constitute 
EDA. Instead, he argued that EDA is an attitude, and the techniques are simply tools. 

Resistant indicators. Tukey was an advocate of using what he termed “resistant 
indicators” in data analysis. He coined the term “five-number summary” for five values that 
describe a distribution, namely the minimum value, first quartile (Q1), median, third quartile 
(Q3), and maximum value. (Actually, Tukey proposed statistics that he called “hinges”, which are 
usually but not always equal to Q1 and Q3.) Tukey discussed a variety of resistant indicators in 
addition to the five-number summary, but their presentation is beyond the scope of this article. 
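
As an illustration only, the five-number summary can be sketched in a few lines of Python. The function name and the example scores are hypothetical, and the sketch uses ordinary quartiles from the standard library rather than Tukey's hinges, which can differ slightly. The last line contrasts a non-resistant indicator (the mean) with a resistant one (the median) when one extreme score is present.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max).

    Assumption: quartiles are used as a stand-in for Tukey's hinges.
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]

# Hypothetical scores with one extreme value (30).
scores = [2, 4, 4, 5, 6, 7, 8, 9, 30]
summary = five_number_summary(scores)
print(summary)

# The mean is pulled toward the extreme score; the median is not.
print(statistics.mean(scores), statistics.median(scores))
```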

Graphical displays for univariate analyses. Tukey (1977) developed a number of 
graphical displays for EDA, but is probably best known for the stem-and-leaf plot and the box- 
and-whiskers plot. Stem-and-leaf displays are similar to histograms, except that they display 
each individual data point, and are useful for examining the shape of a distribution as well as for 
looking for rogue values. Box-and-whiskers plots, also known as box plots, do not display each 
individual data point, but instead display the five-number summary as well as “outside values”. 
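
To make the idea of a display that retains every data point concrete, the following is a minimal sketch of a stem-and-leaf display for non-negative integers. The function name and the data are hypothetical; the tens digits are taken as stems and the units digits as leaves, which is only one of the layouts Tukey described.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return the lines of a simple stem-and-leaf display (non-negative integers)."""
    stems = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # tens digit(s) = stem, units digit = leaf
        stems[stem].append(leaf)
    lines = []
    # Empty stems are printed too, so gaps in the distribution stay visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines

# Hypothetical scores; 59 sits apart from the rest of the distribution.
scores = [12, 15, 21, 23, 23, 28, 34, 35, 59]
display = stem_and_leaf(scores)
print("\n".join(display))
```

Because each leaf is an actual digit from the data, the shape of the distribution and any isolated values can be read directly from the printed rows.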






Outside values. Tukey termed the difference between the hinges “H-spread”, which is 
usually, but not always, equivalent to the inter-quartile range. He proposed a rule of thumb for 
identifying “outside” values based on this spread. In particular, Tukey proposed calculating 1.5 
times the H-spread. Outside values are those values that are either (a) below the lower hinge 
minus 1.5 times the H-spread, or (b) above the upper hinge plus 1.5 times H-spread. These 
outside values are individually plotted in box plots. 
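
The rule of thumb described above can be sketched as follows. This is an illustration under stated assumptions: the function name and data are hypothetical, and quartiles are used in place of hinges, so the spread here is the inter-quartile range rather than the H-spread proper.

```python
import statistics

def outside_values(values, multiplier=1.5):
    """Return values beyond the fences: Q1 - 1.5 * spread or Q3 + 1.5 * spread.

    Assumption: quartiles approximate Tukey's hinges.
    """
    data = sorted(values)
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    spread = q3 - q1
    lower_fence = q1 - multiplier * spread
    upper_fence = q3 + multiplier * spread
    return [v for v in data if v < lower_fence or v > upper_fence]

# Hypothetical scores; Q1 = 4 and Q3 = 8, so the fences are -2 and 14.
scores = [2, 4, 4, 5, 6, 7, 8, 9, 30]
flagged = outside_values(scores)
print(flagged)
```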

Re-expression and smoothing. Tukey was also an advocate of what he called 
“re-expression” (i.e., monotonic nonlinear transformations) to the extent that the re-expression makes 
the data easier to understand (Tukey, 1977). These re-expressions include the log 
transformation, square root transformation, and reciprocal transformation. Tukey was also an 
advocate of applying smoothing techniques (e.g., kernel density smoothing), where some of the 
“rough” of the data is removed to get a better picture of the shape of a distribution (Tukey, 1977). 
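
A small sketch may help show why re-expression can make data easier to understand. The data below are hypothetical and were chosen so that the effect is easy to see: the raw values are strongly right-skewed, while their logarithms are evenly spaced. The helper name is an assumption made for illustration.

```python
import math
import statistics

# Hypothetical, strongly right-skewed values.
raw = [1, 10, 100, 1000, 10000]

# The three re-expressions named in the text.
logged = [math.log10(x) for x in raw]
rooted = [math.sqrt(x) for x in raw]
reciprocal = [1 / x for x in raw]  # note: this one reverses the order of the values

def mean_minus_median(values):
    """Crude symmetry check: near zero for a symmetric batch of values."""
    return statistics.mean(values) - statistics.median(values)

print(mean_minus_median(raw))     # large and positive: long right tail
print(mean_minus_median(logged))  # essentially zero: symmetric after re-expression
```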

Examining bivariate relationships. Tukey proposed a variety of EDA techniques for 
examining the relationship between pairs of variables (Tukey, 1977). In terms of examining the 
relationship between two quantitative variables, Tukey advocated analyzing simple scatter 
diagrams, and proposed a technique for fitting a “resistant” line through the data points. (This 
line is sometimes referred to as a “Tukey line”, and is an alternative to the ordinary least squares 
regression line). Tukey and his colleagues have written extensively on how to examine residuals 
in this context. 
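
The general idea behind a resistant line can be sketched as follows. This is a simplified, single-pass version based on medians of the lower and upper thirds of the data; it is not a full reproduction of Tukey's procedure, which refines the fit iteratively. The function name and data are hypothetical.

```python
import statistics

def resistant_line(xs, ys):
    """Fit y = intercept + slope * x from medians of the outer thirds (simplified)."""
    pairs = sorted(zip(xs, ys))
    third = len(pairs) // 3
    if third == 0:
        raise ValueError("need at least three data points")
    left, right = pairs[:third], pairs[-third:]
    x_left = statistics.median([x for x, _ in left])
    y_left = statistics.median([y for _, y in left])
    x_right = statistics.median([x for x, _ in right])
    y_right = statistics.median([y for _, y in right])
    slope = (y_right - y_left) / (x_right - x_left)
    # The intercept is the median residual once the slope has been removed.
    intercept = statistics.median([y - slope * x for x, y in pairs])
    return slope, intercept

# Hypothetical data following y = 1 + 2x, except for one wild final value.
xs = list(range(1, 10))
ys = [3, 5, 7, 9, 11, 13, 15, 17, 100]
slope, intercept = resistant_line(xs, ys)
print(slope, intercept)
```

Because only medians are used, the single wild value does not change the fitted slope or intercept in this example, whereas a least squares fit to the same points would be pulled toward it.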

Studies on EDA in Behavioral Science Research 

There are no empirical studies on the attitudes of researchers in the behavioral sciences 
towards EDA, or on the extent to which EDA is actually being used in behavioral science 
research. The only research that provides some insight into this issue is the research on the 
statistics training of doctoral and undergraduate students in psychology. In terms of doctoral 
students, Aiken, West, Sechrest, and Reno (1990) found that 89% of the 186 departments they 
studied offered an introductory graduate statistics sequence, and that in 77% of the departments 
offering this sequence, it was one year long. Aiken et al. also found that 20% of introductory 
statistics sequence courses covered EDA in depth. Furthermore, in only 15% of the departments 
were most or all doctoral students judged to be competent in modern graphical data display, and 
in only 8% of the departments were most or all doctoral students judged to be competent in the 
detection and treatment of influential data. 

Friedrich, Buday, and Kerr (2000) conducted a similar study of undergraduate 
psychology majors. These researchers surveyed 185 “general” and 55 “elite” psychology 
programs, and found that a statistics course was required in 93% of the programs, either 
integrated with a research methods course (26%) or given as a stand-alone course (86%). 

Specific information on EDA training was not collected, but these researchers did find that in 
over half of the programs, one hour or less of introductory statistics class time was devoted to 
graphical analysis of data (e.g., box-and-whisker and residual plots). 

Research on Evaluating Statistics Textbooks 

There are many reviews of individual statistics textbooks. Journals such as the Journal of 
Educational and Behavioral Statistics and the Journal of the American Statistical Association 
routinely publish reviews of new and revised statistics textbooks. But there have been few 
systematic evaluations of statistics textbooks across one or more dimensions (Harwell et al., 
1996), and only one that specifically considered EDA. Cobb (1987) evaluated 16 statistics 
textbooks on a variety of dimensions, including EDA. He distinguished between EDA 
techniques (such as stem-and-leaf diagrams and box-and-whiskers plots) and EDA attitudes, 
which he described as attending to issues such as residuals, outliers, and transformations. Fewer 
than half of the books in his sample covered EDA attitudes. In terms of techniques, four texts 
had no coverage, four texts had one to four pages of coverage, and seven texts had more than 
four pages of coverage. 

Unfortunately, it is difficult to draw inferences from Cobb’s (1987) research to 
behavioral science research. First, the textbooks he evaluated were not written specifically for 
the behavioral sciences, but were instead general statistics textbooks. Second, his study is now 
15 years old, so the findings might be dated. 

Specific Research Interests 

This study had three parts. First, the extent to which statistics textbook authors described 
the underlying philosophy and general heuristics of EDA was examined. Second, the extent to 
which statistics textbook authors presented data analytic tools commonly associated with EDA, 
including basic tools such as stem-and-leaf displays, dot charts, box-and-whiskers charts, and 
residual analysis, was studied. Finally, the extent to which the authors integrated EDA practices 
throughout their textbook was considered. 

Methods 

Sample. There were two main criteria for selecting textbooks. First, the textbook needed 
to be recent, which was defined as a publication date of 1998 or later. Second, the textbook 
needed to be written specifically for behavioral science audiences. There were 20 textbooks in 
the final sample. Fifteen of these textbooks were obtained based on information provided by the 
Faculty Online website (http://www.facultyonline.com). This site provides information on 
top-selling textbooks, based upon sales information from university bookstores. 

It seemed possible that a statistics textbook could be a top-seller because it is a 
“bare-bones” text that is targeted towards a less-challenging statistics course, and that the sample 
textbooks might not include more rigorous introductory statistics textbooks. In order to be sure 
that more thorough texts were included, the publishers of the texts listed above were contacted 
and asked for their recommendations. Five more texts were added to the sample in this way. 

The final sample consisted of 20 textbooks. Fifteen came from the Faculty Online website, namely 
Aron and Aron (1999); Frankfort-Nachmias and Leon-Guerrero (2000); Gravetter and Wallnau 
(2002); Healey (2002); Heiman (2000); Howell (2002); Hurlburt (1998); Jaccard and Becker 
(2002); Kiess (2002); Levin and Fox (2000); McCall (2001); Pagano (2001); Runyon, Coleman, 
and Pittenger (2000); Welkowitz, Ewen, and Cohen (2000); and Witte and Witte (2001). Five were 
recommended by sales representatives, namely Abrami, Cholmsky, and Gordon (2001); Bartz 
(1999); Hinkle, Wiersma, and Jurs (1998); Sprinthall (2000); and Thorndike and Dinnel (2001). 

Results 

Summary statistics of various EDA-related indicators are presented in Table 1. 

Insert Table 1 about here. 



EDA Philosophy 

The underlying EDA philosophy was not addressed in the vast majority of textbooks. 
Only three textbooks contained an entire chapter devoted to EDA philosophy and techniques, 
and none of the remaining textbooks even had a major section devoted to EDA. Seven textbooks 
contained references to published work by Tukey and his colleagues, but in most of these 
textbooks, Tukey’s work was cited as a reference for one or more specific techniques, rather than 
for an understanding of EDA philosophy. And although all of the textbooks covered statistical 
significance testing in great detail, the term “confirmatory data analysis” was never used in any 
textbook. Finally, the concept of cross-validation, which would be a critical issue in 
model-building with EDA, was not discussed in any textbook. 

To get a rough indication of the extent to which EDA had been integrated throughout the 
textbooks, the sections of the textbooks that covered the matched-pair t-test or two-group t-test 
were analyzed in detail. Every illustration of one of these t-tests using actual data was coded in 
terms of whether or not the textbook author(s) first presented a graphical display of the data 
being analyzed. Thus, the question was simply whether the data were looked at prior to analysis. 
Only one textbook author presented an accompanying graphical display for these t-tests. As a 
further gauge of the extent to which EDA had been integrated, the end-of-chapter assignments 
that involved these t-tests were examined. Again, only one textbook required students to 
construct a graph prior to computing one of these t-tests. Thus, overall, there was virtually no 
integration of EDA principles across the chapters of these textbooks. 

EDA Techniques 

Graphical displays. For this analysis, the frequency of three basic EDA-related graphical 
displays was considered, namely Tukey’s stem-and-leaf display, dot plot, and box-and-whiskers 
plot. The stem-and-leaf display and box-and-whiskers plot were presented in 15 and 9 texts, 
respectively, and in most of the textbooks that covered these displays, the authors attributed the 
origins of the stem-and-leaf display and box-and-whiskers plot to Tukey. In terms of using these 
displays to compare the distributions of two or more groups, the extension of the stem-and-leaf 
display was shown in only 8 of the 15 textbooks, whereas the box-and-whiskers plot for 
comparing groups was shown in 7 of the 9 textbooks. Tukey’s dot plot was not presented in any 
of the textbooks. 

Outside values and the box-and-whiskers plot. One useful aspect of the 
box-and-whiskers plot is that individual data points for “outside” values can be plotted outside the bounds 
of the whiskers, as described earlier. In seven of the nine textbooks where the box-and-whiskers 
plot was presented, individual outside values were illustrated in the plots, although in one of 
these seven texts, the outside values were determined based on standard scores rather than 
Tukey’s procedure. Of the remaining six textbooks that did use Tukey’s procedure, the authors 
of two texts never explained how this was done. Thus, even though nearly half of the textbooks 
included box-and-whiskers plots, one of the most valuable aspects of this display was not 
explained in most of the textbooks. 

Outliers. One of the hallmarks of EDA is paying attention to “unusual” values, which are 
sometimes called outliers, outside values, extreme scores, or rogue values. The extent to which 
textbook authors addressed the issue of outliers was also assessed, regardless of whether outliers 
were discussed in the context of EDA in the Tukey tradition. Surprisingly, the topic of outliers 
in a single distribution was mentioned in only 12 textbooks, and in only 5 of these textbooks did 
the authors present one or more strategies for identifying these extreme values. (In 4 of the 5 
texts, Tukey’s procedure for identifying outside values was presented.) The issue of how to 
actually think about outliers, in terms of their possible causes or in terms of how to handle 
them, was rarely discussed. 

In order to get a rough sense of the extent to which the topic of outliers was addressed in 
more complex situations, the ways in which textbook authors addressed the possible effect of 
outliers on the Pearson product-moment correlation coefficient (r_xy) were also evaluated. This 
correlation coefficient was chosen because EDA emphasizes graphics, and because the 
underlying relationship between two quantitative variables can be easily inspected using a scatter 
diagram. The correlation coefficient was covered in all 20 textbooks. But in half of the texts, 
the topic of the possible effects of one or more extreme scores on r_xy was not mentioned. And in 
most of the remaining textbooks, the issue of how to actually think about such outliers, in terms of 
their possible causes or in terms of how to handle them, was again rarely discussed. 
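
The sensitivity of the correlation coefficient to a single extreme score can be illustrated with a short sketch. The data and names below are hypothetical and are not drawn from any of the textbooks reviewed; the coefficient is computed from its textbook definition.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation computed from its definition."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores with a clear positive relationship.
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 1, 4, 3, 6, 5]
r_clean = pearson_r(xs, ys)

# The same scores plus one extreme case at (20, 0).
r_with_outlier = pearson_r(xs + [20], ys + [0])

print(round(r_clean, 2), round(r_with_outlier, 2))
```

In this example one added case is enough to turn a strong positive correlation into a negative one, which a scatter diagram of the seven points would reveal immediately.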

EDA-Influenced Statistics Textbooks 

Textbooks with some EDA philosophy and a number of EDA techniques. There were five 
textbooks that were the “best of the lot” in terms of the way in which they covered EDA: 
Abrami, Cholmsky, and Gordon (2001); Hinkle, Wiersma, and Jurs (1998); Howell (2002); 
McCall (2001); and Runyon, Coleman, and Pittenger (2000). Three of these texts had entire 
chapters devoted to exploring data: (a) Howell; (b) McCall; and (c) Runyon, Coleman, and 
Pittenger. (Howell’s chapter was entitled “Describing and Exploring Data”, and was not strictly 
limited to describing EDA in the Tukey tradition.) 

These five textbooks differed from the remaining texts not only in terms of the number of 
techniques covered, but also, more importantly, in terms of the depth of their coverage. All five 
textbooks included the stem-and-leaf display, and four of the five included presentations of 
back-to-back displays for comparing the distributions of two groups. Three of these texts were 
particularly notable in terms of how to use this display as a data analytic tool: Howell (2002); 
McCall (2001); and Runyon, Coleman, and Pittenger (2000). In all five of these textbooks, the 
authors also presented the box-and-whiskers plot, and in all five of these textbooks, outside 
values were included in the plots. (Abrami, Cholmsky, and Gordon (2001), however, did not 
explain how these outside values were identified.) Several of these texts were particularly 
detailed in terms of how to actually analyze data using these plots (see Abrami, Cholmsky, and 
Gordon, 2001; Hinkle, Wiersma, and Jurs, 1998; and Howell, 2002). 

In terms of identifying possible outside values, all five textbooks covered the 
topic of outliers for a single distribution. And in two of these texts (Abrami, Cholmsky, and 
Gordon, 2001; and Hinkle, Wiersma, and Jurs, 1998) the authors provided a particularly detailed 
discussion on how to think about outside values. Moreover, in terms of bivariate relationships, 
three of the five texts included discussions on the possible effect of outliers on the correlation 
coefficient, and two of these texts provided “rules of thumb” based on the analysis of 
standardized residuals for identifying possible outliers. 

Howell’s (2002) text was the most notable in terms of integrating graphical displays 
throughout the text. For example, Howell included a box-and-whiskers plot and stem-and-leaf 
display to accompany his one-sample t-test example (p. 187) and matched-pair t-test example (p. 
193). He included a box-and-whisker plot to accompany his one-factor ANOVA example (p. 
334). Lastly, he included stem-and-leaf displays to accompany his multiple linear regression 
example (p. 537). 

Summary and Recommendations 

What impact have Tukey’s writings on EDA had on the content of introductory statistics 
textbooks in the behavioral sciences? For better or worse, in this study, Tukey’s name was most 
often cited in terms of one or more specific EDA techniques: most typically the stem-and-leaf 
display and, to a lesser extent, the box-and-whiskers plot. EDA philosophy, by contrast, was not 
covered in the vast majority of these textbooks. Only three textbooks had a chapter devoted to 
EDA, and none of the remaining texts even had a section devoted to this topic. In terms of 
integrating EDA principles throughout the text, only Howell (2002) included stem-and-leaf 
displays and box-and-whisker plots as part of his illustrations of various significance tests. 

Why isn’t EDA already routinely integrated into statistics textbooks in the behavioral 
sciences? Although there is no empirical evidence to answer this question, several explanations 
seem plausible. First, some statistics textbook authors might not be familiar with EDA, and 
therefore might not see the value in it. Along these same lines, they might think EDA is 
“fudging” the data. Second, some statistics textbook authors might be driven by market demand 
to write a one-semester statistics textbook that focuses mainly on “old standard” statistical 
significance tests. In fact, there probably is not enough time in a typical one-semester course to 
cover EDA principles and procedures and to cover advanced statistical CDA procedures such as 
higher-order ANOVA, so some trade-offs would need to be made. But in terms of modeling 
“best practices” in data analysis, it makes more sense to include EDA in an introductory class, 
and save advanced CDA procedures for more advanced classes. 

Final Recommendations 

1. Introductory statistics classes should have a unit devoted to EDA. The EDA unit 
should appear at the beginning of a course, since EDA typically precedes CDA, and address the 
basic philosophy and techniques of EDA. 

2. In discussing hypothesis generation and model-building with EDA, the issue of 
cross-validation must also be addressed. The topic of cross-validation was not addressed in any of the 
20 textbooks considered here, even those textbooks that specifically discussed EDA. This is 
problematic. Although EDA can be a powerful means of developing hypotheses and building 
models, CDA should not be conducted on the same data set that was used for model building, 
because doing so will lead to an inflated Type I error rate. To actually test hypotheses based on 
discoveries from EDA, a second data set is needed. This two-stage process, known as 
cross-validation, requires relatively large data sets, and needs to be considered at the design stage of a 
study. 

3. Journal editors should clearly convey the message that it is not only acceptable to 
explore data, but desirable. As Behrens (1997) stated, “... all published and initial work should 
be explored. The field would greatly benefit if all published reports included the statement ‘we 
examined the data in detail and found the patterns underlying the summary statistics were not 
obviously pathological’” (p. 154). 
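
The two-stage process described in the second recommendation can be sketched as a random split of the cases at the design stage. This is an illustration only: the function name, the 50/50 split, and the fixed seed are assumptions made for the example, not a procedure taken from the sources cited here.

```python
import random

def split_for_cross_validation(records, explore_fraction=0.5, seed=0):
    """Randomly divide cases into an exploratory half and a confirmatory half."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = int(len(shuffled) * explore_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical participant IDs.
participants = list(range(100))
explore, confirm = split_for_cross_validation(participants)

# EDA (model building) would use only `explore`;
# CDA (significance testing) would use only `confirm`.
print(len(explore), len(confirm))
```

Keeping the two subsets disjoint is what prevents the hypotheses from being tested on the same cases that suggested them.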

For too long, the process of data analysis in psychological research has been viewed 
strictly in terms of a confirmatory model. Hopefully, the findings of this study will serve as a 
springboard for future discussions about the value of EDA in the behavioral sciences as well as 
the way in which we should be training students to think about the data analytic process. 
Table 1 

Frequency of Various EDA-Related Indicators (N = 20 textbooks) 

Number of texts in which ... : Number 

EDA Philosophy 
one or more EDA publications by Tukey or his colleagues are cited as a reference: 7 
the philosophy of EDA is covered: 5 
an entire chapter is devoted to EDA: 3 
the term “confirmatory data analysis” is used: 0 

EDA Techniques - Graphs 
a stem-and-leaf display for one or more groups is presented: 15 
the origin of the stem-and-leaf display is attributed to Tukey: 10 
a box-and-whiskers plot for one or more groups is presented: 9 
a box-and-whiskers plot with “outside values” plotted outside the “whiskers” is presented: 7 
the origin of the box plot is attributed to Tukey: 7 
a graphical display for the data associated with an illustration of a t-test is presented: 1 
students are asked to construct a graph for at least one end-of-chapter t-test assignment: 1 

EDA Techniques - Outliers 
outliers are mentioned in terms of a single distribution: 12 
the possible effects of outliers on r_xy are discussed: 10 
Tukey’s method for identifying “outside values” is presented: 4 

EDA Techniques - Resistant Indicators 
Tukey’s five-number summary (min, Q1, Q2, Q3, max) is presented: 4 
Tukey’s notion of “resistant indicators” is mentioned: 2 

EDA Techniques - Smoothing Technique 
nonlinear transformations for a single quantitative variable are presented: 3 
smoothing techniques for a single quantitative variable are presented: 0 

EDA Techniques - Bivariate Procedures 
the “Tukey line” for the relationship between two quantitative variables is presented: 0 